Summary: Dynamic resource management is a key element of the cloud computing paradigm and, more
recently, of cloud networking. In this context of virtualized infrastructures, reducing the costs
associated with the use and re-allocation of resources compels cloud operators and users to manage
them rationally. In this work we propose a probabilistic description of resource needs tied to the
volatility of the workload of a video-on-demand distribution service. This description can then
serve as the input to the dynamic provisioning and allocation of the required resources. Our
approach relies on building a stochastic model inspired by the standard Markov models of epidemic
spreading, able to reproduce sudden and intense variations of activity (buzz). We then propose a
heuristic procedure to identify the model from time series of the number of users connected to the
server. The estimation performance for each model parameter is evaluated numerically, and we check
the adequacy of the model to the data by comparing the steady-state distributions as well as the
autocorrelation functions of the processes.

The Markovian properties of our model guarantee that it satisfies a large deviation principle,
which statistically characterizes the magnitude and duration of extreme and rare events such as
those produced by buzzes. We exploit this property to size the volume of resources (e.g. bandwidth,
number of servers, buffer sizes) to provision so as to strike a good trade-off between the cost of
redeploying infrastructures and quality of service. This probabilistic approach to resource
management opens perspectives on Service Level Agreement policies suited to clouds and best serving
the interests of network operators, service operators and their customers.

Keywords: Networks, Cloud, Probabilistic Resource Management, Epidemic Models, Workload Generator,
Statistical Estimation, Large Deviation Principle, Service Level Agreement, Video on Demand, Buzz



A Versatile Model for VoD Buzz Workload: Identification, 
Numerical Validation and Applications in Dynamic 
Resource Management 

Abstract: Dynamic resource management has become an active area of research in the Cloud
Computing paradigm. The cost of resources varies significantly depending on the configuration in
which they are used. Hence, efficient management of resources is of prime interest to both Cloud
Providers and Cloud Users. In this report we suggest a probabilistic resource provisioning approach
that can be exploited as the input of a dynamic resource management scheme. Using a Video on Demand
use case to justify our claims, we propose an analytical model inspired by standard models of
epidemic spreading to represent sudden and intense workload variations. As an essential step we
also derive a heuristic identification procedure to calibrate all the model parameters and evaluate
the performance of our estimator on synthetic time series. We show how well our model fits real
workload traces with respect to the stationary case in terms of steady-state probability and
autocorrelation structure. We find that the resulting model verifies a Large Deviation Principle
that statistically characterizes extreme rare events, such as the ones produced by "buzz effects"
that may cause workload overflow in the VoD context.
This analysis provides valuable insight into the abnormal behaviors that can be expected of
systems. We exploit the information obtained using the Large Deviation Principle for the proposed
Video on Demand use case to define policies (Service Level Agreements). We believe these policies
for elastic resource provisioning and usage may be of interest to all stakeholders in the emerging
context of cloud networking.

Key-words: Cloud Networking, Probabilistic Resource Management, Epidemic Model, Workload
Generator, Statistical Estimation, Large Deviation Principle, Service Level Agreements,
Video on Demand, Buzz
1 Introduction

With the recent trend of data-intensive applications with pay-as-you-go execution in a cloud
environment, new challenges arise in system management and design to optimize resource
utilization. The applications deployed in a cloud can be very diverse, and some exhibit highly
varying resource demand. In this paper we consider a Video on Demand (VoD) system as a relevant
example of a data-intensive application where bandwidth usage varies rapidly over time.

A VoD service delivers video contents to consumers on request. According to Internet usage
trends, users are increasingly involved in VoD and this enthusiasm is likely to grow. According to
2010 statistics, a popular VoD provider like Netflix accounts for around 30 percent of the peak
downstream traffic in North America and is the "largest source of Internet traffic overall" [1].
Since VoD has stringent streaming rate requirements, each VoD provider needs to reserve a
sufficient amount of server outgoing bandwidth to sustain continuous media delivery (we are not
considering IP multicast here). However, resource reservation is very challenging when a video
becomes popular very quickly, leading to a flood of user requests on the VoD servers. This
situation, also known as a "buzz", demands an adaptive resource allocation strategy to cope with
the sudden (and significant) variation of workload. One example of a "buzz" is shown in Figure 1,
where interest in the video "Star Wars Kid" [2] grew very quickly within a short timespan.
According to [3] it was viewed more than 900 million times within a short interval of time, making
it one of the top viral videos. Such bandwidth volatility creates significant challenges in
meeting both the desired QoS and efficient resource allocation. A sensible approach to this problem
is to help providers better understand and capture the underlying characteristics of their
applications: for example, if the information diffusion process follows a gossip-like (or
epidemic) behavior, the rate at which a viewer keeps gossiping about a video and for how long (on
average).



Inria 



VoD Buzz Workload and its application in Dynamic Resource Management 






[Figure 1: plot titled "Video Downloads (April 29 - July 29)"; horizontal axis: Time (days);
vertical axis ticks from 100,000 to 500,000.]

FIGURE 1 - Video server workload: time series displaying a characteristic flash-crowd pattern
(buzz effect). Trace obtained from [2].



In this report we follow a constructive approach to propose a stochastic epidemic workload
generator for a VoD system based on a Markov model. We show that it succeeds in reproducing the
traffic volatility exhibited in a real trace, including the buzz occurrence. But the principal
interest of our model is that it verifies a Large Deviation Principle (LDP) that gives a
probabilistic description of the mean workload of the system over different time scales. It thus
allows for statistically characterizing extreme rare events such as the ones produced by buzz
transients. Our ultimate objective is to exploit this large deviation information as an input of
a probabilistic resource management scheme. However, for the proposed model to conform with this
objective, it needs to be "identifiable" and easily calibrated on real data. The corresponding
estimation procedure may not be trivial, since the VoD model is a non-parsimonious model and
accounts for complex dynamics. In this report we propose a complete framework for operators to
identify the VoD model parameters based on a server workload trace.
After parameter estimation we devise two possible and generic ways to exploit the large
deviation information in the context of probabilistic resource provisioning. They can serve as the
input of resource management functionalities of the Cloud environment. Elasticity cannot be
defined without the notion of a time scale; the LDP naturally integrates the time resolution in
the description of the system. Note that Markovian processes do satisfy the LDP, but so do some
other models. Hence, our proposed probabilistic approach is very generic and can adapt to address
any provisioning issue, provided the resource volatility can be faithfully represented by a
stochastic process for which the LDP holds.
In a nutshell, our contributions in this report include:

- A versatile Markov-based model to generate VoD workload,

- A heuristic identification procedure for the proposed workload model,

- A numerical evaluation of the estimator for each parameter of the model,

- A real case study to assess the adequacy of our model to fit video workload traces,

- An analysis of the Large Deviation property of the proposed Markovian model,

- A discussion of the generic ways to exploit the large deviation information in the context
of probabilistic resource provisioning.
Moreover, since we followed a constructive approach, each parameter of the model accounts for
a specific component of the system, so its estimated value also quantifies the importance of the
corresponding dynamic effect.

The rest of the paper is organized as follows. In Section 2 we discuss related work. We describe
our model and analyze it further in Section 3. Section 4 outlines the parameter estimation
procedure and validates it against synthetic workload traces. In Section 5 we validate both our
model and the estimation procedure against real workload traces. Section 6 presents the Large
Deviation Principle and numerical interpretations of the Large Deviation Spectrum. Section 7 deals
with the probabilistic provisioning scheme, derived from the Large Deviation Spectrum for our use
case. Finally, we conclude and discuss future work in Section 8.
2 Related Work

Information dissemination in a VoD system has been an active area of research. In [1], it has
already been demonstrated that epidemic algorithms can be used as an effective solution for
information dissemination in VoD-like P2P systems. However, in this model an individual process
must have a precise idea of the total number of processes in the system. Scalability is another
challenge that the authors addressed in this work. The authors of [S] studied random epidemic
strategies, such as the random peer, latest useful chunk algorithm, to achieve optimal information
dissemination. But the main objective of this work is to demonstrate ways to achieve performance
trade-offs using unstructured, epidemic live streaming systems; it does not bring any information
about the underlying dynamics of the streaming system. The authors of [ïïj similarly discussed an
analytical framework for gossip protocols based on the pairwise information exchange between
interacting nodes. However, this model only provides an analysis of networks with lossy channels.
Another work relevant to our study is [7], where the authors proposed an approach to predict
workload for cloud clients. They considered an auto-scaling approach for resource provisioning and
validated the result with real-world cloud client application traces. However, this work depends
on similar past occurrences of the current short-term workload history and is not appropriate for
dealing with sudden, short and large variations of workload such as the ones produced by buzz
effects. The authors of [5] present a statistical study of streaming traffic. They analyzed VoD
session characteristics, amount and types of media delivered, popularity and daily access profile
in order to develop a workload generator. However, the model does not involve the dynamics of the
process itself, so it is not naturally suited to inferring dynamic resource allocation strategies.
The authors of [S], [TU] and [11] also develop user activity models to describe the usage of
system resources. A limitation of these models is that they only give average results, and dealing
with mean workloads might not be sufficient to clearly describe applications because of their
potential volatility. In [12] the authors proposed a maximum likelihood method for fitting a
Markov arrival process (MAP) to web traffic measurements collected in commonly available HTTP web
server traces. This method achieves reasonable accuracy in predictive models for web workloads but
lacks the intuitive nature of a gossip-based method for describing user behavior. In [13] the
authors statistically model traffic volatility in large-scale VoD systems using a GARCH
(generalized autoregressive conditional heteroscedasticity) process. Amazon CloudWatch follows
this approach and provides a free resource monitoring service to Amazon Web Service customers for
a given frequency. Based on such estimates of future demand, each VoD provider can individually
reserve a sufficient amount of bandwidth to satisfy on average its random future demand within a
reasonable confidence. However, according to the authors, this technique only models and forecasts
the mean (or expected) demand, whereas the real demand might vary around this predicted mean. They
suggested provisioning an additional "risk premium" to the service providers for tolerating the
demand fluctuation. In another workload model, the authors of P3] [15] proposed a Markov Modulated
Poisson Process (MMPP) based approach for buzz modeling, with parameter estimation using the index
of dispersion. However, the MMPP model includes only short-term memory in the system and the
obtained statistics are not physically interpretable for drawing inferences about the system
dynamics.
The model we derive in Section 3 of this report has the following advantages:

- It follows a constructive approach, based on a Markov model,

- It is identifiable and succeeds in capturing workload volatility,

- It satisfies the large deviation properties, which can be exploited to frame dynamic resource
allocation strategies.
3 A VoD System and its modeling

A VoD service delivers video contents to consumers on request. According to Internet usage
trends, users are increasingly involved in VoD and this enthusiasm is likely to grow. A popular
VoD provider like Netflix accounts for around 30 percent of the peak downstream traffic in North
America and is the "largest source of Internet traffic overall" [1]. In a VoD system, consumers
are video clients who are connected to a Network Provider. The source video content is managed and
distributed by a Service Provider from a central data centre. With the evolution of Cloud
Computing and Networking, the service in a VoD system can be made more scalable by dynamically
distributing the caching/transcoding servers across the network providers. Video service providers
interact with the network service providers and describe the virtual infrastructures required to
implement the service (such as the number of servers required, their placement and the clustering
of resources). The resource provider reserves resources for a certain time period and may change
the reservation dynamically depending on resource requirements. Such a dynamic approach saves cost
through dynamic resource provisioning, which is important for service providers as VoD workload is
highly variable by nature. However, since the virtual resources used by Cloud Networking have a
set-up time which is not negligible, analysis and provisioning of such a system can be very
critical from the operator's perspective (CAPEX versus OPEX trade-off). Figure 2 shows a VoD
schematic where the back-end server is connected to the data centre and the transcoding (caching)
servers are placed across the network providers. Since VoD has stringent streaming rate
requirements, each VoD provider needs to reserve a sufficient amount of server outgoing bandwidth
to sustain continuous media delivery. When multiple VoD providers (such as Netflix) use cloud
services from cloud providers, there will be a market between VoD providers and cloud providers,
and the commodities traded in such a market consist of bandwidth reservations, so that VoD
streaming performance can be guaranteed.

[Figure 2: schematic showing the VoD database and back-end server, transcoding servers (T) and
video clients (C).]

FIGURE 2 - Basic schematics of a VoD system with transcoding/caching servers

As a buyer in such a market, each VoD provider can periodically make reservations for
bandwidth capacity to satisfy its random future demand. A simple way to achieve this is to
estimate the expectation and variance of its future demand using historical demand information,
which can easily be obtained from cloud monitoring services. As an example, Amazon CloudWatch
provides a free resource monitoring service to Amazon Web Service customers for a given frequency.
Based on such estimates of future demand, each VoD provider can individually reserve a sufficient
amount of bandwidth to satisfy on average its random future demand within a reasonable confidence.
However, this information is not helpful in the case of a "buzz" or a "flash crowd", when a video
becomes popular very quickly, leading to a flood of user requests on the VoD servers. In
situations like the one described in Figure 1, variance estimation, or more generally the
steady-state distribution, cannot explain the burstiness of such an event because time resolution
is excluded from the description. The LDP, by virtue of its multi-resolution extension of the
classical steady-state distribution, can describe the dynamics of rare events like this, which we
believe can be of interest for VoD service providers.

3.1 Markov Model to describe the VoD user behavior 

Epidemic models commonly subdivide a population into several compartments: susceptible
(noted S) to designate the persons who can get infected, and contagious (noted C) for the persons
who have contracted the disease. This contagious class can further be split into two parts: the
infected subclass (I), corresponding to the persons who are currently suffering from the disease
and can spread it, and the recovered class (R) for those who got cured and do not spread the
disease anymore [16]. There can be more categories, which fall outside the scope of our current
work. In these models $(N_S(t))_{t \ge 0}$, $(N_I(t))_{t \ge 0}$ and $(N_R(t))_{t \ge 0}$ are
stochastic processes representing the time evolution of the susceptible, infected and recovered
populations respectively.
Similarly, information dissemination in a social network can be viewed as an epidemic spreading
(through gossip), where the "buzz" is a special event in which interest for some particular
information increases drastically within a very short period of time. Following the lines of
related works, we claim that the above-mentioned epidemic models can appropriately be adapted to
represent the way information spreads among the users in a VoD system. In the case of a VoD
system, infected (I) refers to the people who are currently watching the video and can spread the
information about it. In our setting, I directly represents the current workload, which is the
current aggregated video requests from the users. Here, we consider the workload as the total
number of current viewers, but it can also refer to the total bandwidth requested at the moment.
The class R refers to the past viewers. In contrast to the classical epidemic case, we introduce a
memory effect in our model, assuming that the R compartment can still propagate the gossip during
a certain random latency period. Then, we define the probability, within a small time interval dt,
for a susceptible individual to turn into an active viewer as follows:

\[ P_{S \to C} = \big(l + (N_I(t) + N_R(t))\,\beta\big)\,dt + o(dt), \tag{1} \]

where $\beta > 0$ is the rate of information dissemination per unit time and $l > 0$ fixes the
rate of spontaneous viewers. The instantaneous rate of newly active viewers in the system at time
t is thus:

\[ \lambda(t) = l + (N_I(t) + N_R(t))\,\beta. \tag{2} \]

Equation (2) corresponds to the arrival rate $\lambda(t)$ of a non-homogeneous (state-dependent)
Poisson process. This rate varies linearly with $N_I(t)$ and $N_R(t)$.

To complete our model we assume that the watch time of a video is exponentially distributed
with rate $\gamma$. As already mentioned, it also seems reasonable to consider that a past viewer
will not keep propagating the gossip about a video indefinitely, but remains active only for a
random latency period that we also assume exponentially distributed, with rate $\mu$ (in general
$\mu \ll \gamma$). Another important consideration of the model is the maximum allowable number of
viewers ($I_{max}$) at any instant of time. This assumption conforms to the fact that the
resources in the system are physically limited. For the sake of numerical tractability and without
loss of generality, we also assume the number of past (but still rumour-spreading) viewers at a
given instant to be bounded by a maximum value ($R_{max}$). With these assumptions, and denoting
by $(N_I(t) = i, N_R(t) = r)$ the current state of the Markov process, the probability that the
process reaches a different state $(i' \le I_{max}, r' \le R_{max})$ at time t + dt (dt being
small) reads:

\[
P(i', r' \mid i, r) =
\begin{cases}
(l + (i + r)\,\beta)\,dt + o(dt) & \text{for } (i' = i + 1,\ r' = r)^{1}, \\
(\gamma\, i)\,dt + o(dt) & \text{for } (i' = i - 1,\ r' = r + 1), \\
(\mu\, r)\,dt + o(dt) & \text{for } (i' = i,\ r' = r - 1), \\
o(dt) & \text{otherwise.}
\end{cases}
\tag{3}
\]

This process, which defines the evolution of the current viewer and past viewer populations, is a
finite and irreducible Markov chain. Note that $l > 0$ prevents the process from reaching an
absorbing state. This chain is ergodic and admits a stationary regime.

The descriptions above define the mechanism of information dissemination in the community in
normal situations. A buzz event differs from this situation by a sudden increase of the
dissemination rate $\beta$. To adapt the model to buzz we resort to a Hidden Markov Model (HMM)
able to reproduce the change in $\beta$. Without loss of generality we consider only two states.
One, with dissemination rate $\beta = \beta_1$, corresponds to the buzz-free case described above;
the other hidden state corresponds to the buzz situation, where the value of $\beta$ increases
significantly and takes on a value $\beta_2 \gg \beta_1$. Transitions between these two hidden and
memoryless Markov states occur with rates $a_1$ and $a_2$ respectively (see Figure 3). These rates
characterize the buzz in terms of frequency, magnitude and duration.
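As a rough illustration of how the chain of Eq. (3) and its two hidden states could be simulated, the sketch below uses a standard Gillespie (event-driven) scheme. It is not the generator used in this report: the function names, the boundary handling at $I_{max}$ and $R_{max}$ (transitions that would exceed a bound are simply blocked) and the default bounds are assumptions made for the example.

```python
import random


def simulate_workload(beta1, beta2, gamma, mu, l, a1, a2,
                      t_end, i_max=10**6, r_max=10**6, seed=0):
    """Event-driven simulation of the (i, r) chain with a two-state hidden process.

    Returns a list of (time, i, r, hidden_state) tuples, one per transition,
    starting from an empty system in the buzz-free state (hidden_state == 1).
    """
    rng = random.Random(seed)
    t, i, r, state = 0.0, 0, 0, 1
    path = [(t, i, r, state)]
    while True:
        beta = beta1 if state == 1 else beta2
        rates = [
            (l + (i + r) * beta) if i < i_max else 0.0,  # arrival: i -> i + 1
            gamma * i if r < r_max else 0.0,  # end of viewing: (i, r) -> (i - 1, r + 1)
            mu * r,                           # end of gossip: r -> r - 1
            a1 if state == 1 else a2,         # hidden state switch
        ]
        total = sum(rates)
        t += rng.expovariate(total)           # exponential holding time
        if t > t_end:
            return path
        u = rng.random() * total              # pick a transition proportionally to its rate
        if u < rates[0]:
            i += 1
        elif u < rates[0] + rates[1]:
            i, r = i - 1, r + 1
        elif u < rates[0] + rates[1] + rates[2]:
            r -= 1
        else:
            state = 3 - state                 # toggles between 1 and 2
        path.append((t, i, r, state))


def time_average_workload(path, t_end):
    """Time-weighted mean of i over [0, t_end] for a piecewise-constant path."""
    total = 0.0
    for k, (t0, i, _, _) in enumerate(path):
        t1 = path[k + 1][0] if k + 1 < len(path) else t_end
        total += i * (t1 - t0)
    return total / t_end
```

A call such as `simulate_workload(1e-3, 1e-2, 0.5, 0.1, 1.0, 0.01, 0.1, t_end=1000.0)` returns one trajectory; the parameter values here are arbitrary and are not those of Table 1.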



[Figure 3: Markov chain diagram.]

FIGURE 3 - Markov chain diagram representing the evolution of the Current Viewers (i) and Past
Viewers (r) populations with a Hidden Markov Model.

A closed-form expression for the steady-state distribution of the workload (i) of this model is
not trivial to derive. However, we can easily express the analytic mean workload of the system by
solving the flow balance equation, i.e. equating the incoming and outgoing flow rates in the
steady regime. For ease, we start with $\beta = \beta_1 = \beta_2$ and generalize the result to
$\beta_1 \neq \beta_2$ thereafter. We get:

\[ \mathbb{E}(i) = \frac{l}{\gamma - \beta\,(1 + \gamma/\mu)}, \tag{4} \]

which, to be a positive and finite quantity, yields the stability criterion in the buzz-free
regime:

\[ \beta^{-1} > \gamma^{-1} + \mu^{-1}. \tag{5} \]
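The flow balance argument behind the mean workload and the stability criterion can be spelled out as follows. This is a sketch that ignores the bounds $I_{max}$ and $R_{max}$ and writes $\mathbb{E}(r)$ for the mean number of past viewers, a symbol introduced here only for the derivation.

```latex
\begin{align*}
\text{(R compartment)} \quad & \gamma\,\mathbb{E}(i) = \mu\,\mathbb{E}(r)
  \;\Rightarrow\; \mathbb{E}(r) = \frac{\gamma}{\mu}\,\mathbb{E}(i), \\
\text{(I compartment)} \quad & l + \beta\,\big(\mathbb{E}(i) + \mathbb{E}(r)\big) = \gamma\,\mathbb{E}(i), \\
\Rightarrow \quad & \mathbb{E}(i)\,\Big(\gamma - \beta\big(1 + \tfrac{\gamma}{\mu}\big)\Big) = l, \\
\Rightarrow \quad & \mathbb{E}(i) = \frac{l}{\gamma - \beta\,(1 + \gamma/\mu)} > 0
  \iff \gamma > \beta\,\frac{\gamma + \mu}{\mu}
  \iff \beta^{-1} > \gamma^{-1} + \mu^{-1}.
\end{align*}
```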



1. In a closed system, where the total number of viewers (susceptible, current and past) is
constant, say N, the transition probability for (i' = i + 1, r' = r) needs to be modified, since
it would then also depend on the number of susceptible viewers, i.e. (N - i - r). The transition
probability in this case would be $(l + (i + r)\beta)(N - i - r)\,dt + o(dt)$. Therefore, Eq. (4)
and (5) need to be modified accordingly.
We now extend these results to the case where the model may exhibit buzz activity. As $\beta$
alternates between the hidden states $\beta = \beta_1$ and $\beta = \beta_2$, with respective
state probabilities $a_2/(a_1 + a_2)$ and $a_1/(a_1 + a_2)$, one can simply replace $\beta$ in
Eq. (4) and (5) with the equivalent average value:

\[ \beta = \beta_1\,\frac{a_2}{a_1 + a_2} + \beta_2\,\frac{a_1}{a_1 + a_2}. \tag{6} \]
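The averaging of the dissemination rate over the two hidden states, combined with the flow balance between arrivals, departures from the viewing state and departures from the gossiping state, can be turned into a small numerical helper. The sketch below is illustrative only: the function names are hypothetical, and the mean workload is written in the form $l / (\gamma - \beta(1 + \gamma/\mu))$, which is what the flow balance gives when the bounds on the populations are ignored.

```python
def effective_beta(beta1, beta2, a1, a2):
    """Average dissemination rate, weighting beta1 by a2/(a1+a2) and beta2 by a1/(a1+a2)."""
    return (a2 * beta1 + a1 * beta2) / (a1 + a2)


def mean_workload(beta, gamma, mu, l):
    """Mean number of current viewers from the flow balance, ignoring population bounds.

    Raises ValueError when the stability criterion 1/beta > 1/gamma + 1/mu fails.
    """
    # Written as beta * (1/gamma + 1/mu) < 1 so that beta = 0 is handled without division.
    if not beta * (1.0 / gamma + 1.0 / mu) < 1.0:
        raise ValueError("stability criterion violated: 1/beta <= 1/gamma + 1/mu")
    return l / (gamma - beta * (1.0 + gamma / mu))
```

For a model with buzz, one would call `mean_workload(effective_beta(beta1, beta2, a1, a2), gamma, mu, l)`.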

In order to illustrate the flexibility of our workload model and to validate Eq. (4), we generate
three synthetic traces corresponding to different sets of parameters that verify the stability
condition of relation (5) and are reported in Table 1. Particular realizations of these processes,
generated over $2^{21}$ points, are displayed in Figure 4. While the synthetic traces
corresponding to cases (b) and (c) reproduce distinct and easily identifiable buzz regimes, the
parameter set of case (a) leads to a workload variation distinct from the typical shape of
Figure 1. Nonetheless, for all 3 configurations, the empirical means estimated from the $2^{21}$
samples of the traces are in good agreement with the expected values of Eq. (4).

Table 1 - Parameter values used in the workload model to generate the three traces plotted
in Fig. 4. The last two rows correspond to the theoretical mean workload of Eq. (4) and to the
sample mean value estimated from the traces.

                 case (a)   case (b)   case (c)
                 0.0032     0.0032     0.0032
                 0.0111
                 3.289 x 10^-5
                 10^-4
«1
ai               0.0667     0.0667     0.0667
E(i)             1.92       15.68      44.72
Emp. mean (i)    1.74       16.72      45.23

Finally, let us note that even though we consider exponentially distributed random variables in
our model, any other distribution could be used, which, according to the same balance principle,
would lead to a mean workload and to a stability condition of the same kind as (4) and (5).
However, the estimation procedure we derive in the next section strongly relies on the exponential
assumption and would need to be thoroughly reworked to adapt to different hypotheses.



4 Estimation procedure

In this section, we address the identifiability of our model and design a calibration algorithm
to fit workload data. We start by constructing empirical estimators for each parameter of the
model, and we numerically evaluate their performance on synthetic traces.



4.1 Parameter estimation

Considering a standard epidemic process X with propagation rate $\theta$, the maximum likelihood
estimate $\hat{\theta}_{MLE}$ is derived in [16], [17] and reads:

\[ \hat{\theta}_{MLE} = n \left( \int_0^T X(t)\,dt \right)^{-1}, \tag{7} \]

where n is the number of contaminations (i.e. number of increments of X) occurring within the
time interval T.

[Figure 4: three panels, case (a), case (b) and case (c).]

FIGURE 4 - Illustration of the ability of our model to generate different workload dynamics I(t).
See Table 1 for the parameter values corresponding to each of these three cases. The X-axis
corresponds to time (in hours) while the Y-axis indicates the number of active viewers.

Very often, the maximum likelihood approach yields optimal results (in terms of estimate variance
and/or bias), but it is not always possible to get a closed-form expression for the estimated
parameters. This can be due either to a likelihood function that is impossible to derive
analytically, or to missing data that preclude a straightforward application of the maximum
likelihood principle. Nonetheless, solutions such as the Expectation-Maximization (EM) or the
Markov Chain Monte Carlo (MCMC) algorithms exist, which in some cases can approximate maximum
likelihood estimators.

Returning to our model depicted in Figure 3, each parameter needs to be empirically estimated,
assuming that the instantaneous workload time series is the only available observation.

Watching parameter $\gamma$. As $\gamma$ is the departure rate of users that leave the infected
state after they finish watching a video, it can be inferred directly from the number n of
decrements of the observable process I(t). Therefore, the MLE of Eq. (7) applies
straightforwardly and leads to:

\[ \hat{\gamma} = n \left( \int_0^T I(t)\,dt \right)^{-1}. \tag{8} \]
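For a workload trace recorded as a piecewise-constant path, the estimator of the watching rate reduces to counting the decrements of I(t) and dividing by the area under the curve. The sketch below assumes the trace is given as two parallel lists (jump times and the value held from each jump time onward), which is a representation chosen for this example; the function names are hypothetical.

```python
def integrate_step_function(times, values, t_end):
    """Integral over [times[0], t_end] of a step function.

    values[k] is the value held on [times[k], times[k + 1]); the last value is
    held until t_end.
    """
    total = 0.0
    for k, (t0, x) in enumerate(zip(times, values)):
        t1 = times[k + 1] if k + 1 < len(times) else t_end
        total += x * (t1 - t0)
    return total


def estimate_gamma(times, values, t_end):
    """Number of departures from the viewing state divided by the integral of I(t)."""
    # A drop of size d between consecutive samples counts as d departures.
    departures = sum(max(0, prev - cur) for prev, cur in zip(values, values[1:]))
    return departures / integrate_step_function(times, values, t_end)
```

With `times = [0.0, 1.0, 3.0]`, `values = [2, 1, 0]` and `t_end = 4.0`, there are two departures and the integral is 4, so the estimate is 0.5.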

Memory parameter $\mu$. This rate, at which past viewers leave the recovered compartment and stop
propagating the virus (gossip), relates to the decrement density of the non-observed process
R(t). It is thus impossible to simply apply the MLE of Eq. (7) unless we first construct a
substitute $\hat{R}(t)$ for the missing data from the observable data set I(t). Let us recall that
in our model, all current viewers turn and remain contagious for a mean period of time
$\gamma^{-1} + \mu^{-1}$. Then, to a first approximation, we can consider that $\hat{R}(t)$
derives from the finite memory cumulative process:

\[ \hat{R}(t) = \int I(u)\,du, \tag{9} \]

which itself depends on the parameter to be estimated, $\mu$. We propose an estimation procedure
based on the inherent exponential property of the model. From the Poisson assumption, the
inter-arrival time w between the consecutive arrivals of two new viewers is an exponentially
distributed random variable such that $\mathbb{E}(w \mid I(t) + R(t) = x) = (\beta x + l)^{-1}$.
It means that, for fixed x, the normalized random variable $\tilde{w} = w / \mathbb{E}(w \mid x)$
is exponentially distributed with unit parameter and becomes independent of x. Ideally then, for
each value of R(t) + I(t) = x, all the sub-series $w_x = \{w_n : R(t_n) + I(t_n) = x\}$, after
normalization by their own empirical mean, yield independent and identically distributed
realizations of a unit exponential random variable. In practice though, as R(t) is not
observable, this unit exponential i.i.d. assumption should hold true only if R(t) is accurately
estimated. From there, we propose the following algorithm: for different values of $\mu$ spanning
a reasonable interval, we use $\hat{R}_\mu(t)$ estimated from Eq. (9) to build the normalized
series $\tilde{w}_\mu$. A statistical test applied to each series allows for assessing the
exponential i.i.d. hypothesis and then selecting the value of $\mu$ that yields the best score.
More concretely, we apply to $\tilde{w}_\mu = (\tilde{w}_n)_{n=1,\dots,N}$ the statistical
exponentiality test derived in [18]: form the normalized spacings
$v_\mu = \big(v_{(n)} = (N - n + 1)(\tilde{w}_{(n)} - \tilde{w}_{(n-1)})\big)_{n=1,\dots,N}$,
where $(\tilde{w}_{(n)})_{n=1,\dots,N}$ stands for $\tilde{w}_\mu$ rearranged in ascending order.
Let F and G denote the cumulative distribution functions of $\tilde{w}_\mu$ and $v_\mu$
respectively, and compute the classical Kolmogorov-Smirnov distance:

\[ T_\mu = \sup_{1 \le k \le N} |F(k) - G(k)|. \tag{10} \]

As F and G are identical for an exponentially distributed i.i.d. random series, we then expect
$T_\mu$ to reach its minimum for the value of $\mu$ that gives the best estimate $\hat{R}_\mu(t)$
of R(t):

\[ \hat{\mu} = \operatorname{argmin}_{\mu} T_\mu \quad \text{and} \quad \hat{R} = \hat{R}_{\hat{\mu}}. \tag{11} \]
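The test statistic itself is easy to prototype. The sketch below builds the normalized spacings of a sample and measures the Kolmogorov-Smirnov distance between the empirical distribution of the sample and that of its spacings. Two points are assumptions of this example rather than statements about [18]: the smallest order statistic is differenced against zero, and the supremum is evaluated on the pooled sample points. The reconstruction of the past-viewer process for each candidate value of the memory rate is not covered here.

```python
import bisect


def normalized_spacings(w):
    """Spacings (N - n + 1) * (w_(n) - w_(n-1)) of the sorted sample, with w_(0) = 0."""
    s = sorted(w)
    n_total = len(s)
    spacings, prev = [], 0.0
    for n, x in enumerate(s, start=1):
        spacings.append((n_total - n + 1) * (x - prev))
        prev = x
    return spacings


def ks_distance(a, b):
    """Largest gap between the empirical CDFs of two samples."""
    sa, sb = sorted(a), sorted(b)
    d = 0.0
    # Both empirical CDFs are step functions, so the gap only changes at sample points.
    for x in sa + sb:
        fa = bisect.bisect_right(sa, x) / len(sa)
        fb = bisect.bisect_right(sb, x) / len(sb)
        d = max(d, abs(fa - fb))
    return d


def exponentiality_statistic(w):
    """Distance between a sample and its normalized spacings; small for i.i.d. exponentials."""
    return ks_distance(w, normalized_spacings(w))
```

In a grid search over the memory rate, one would compute `exponentiality_statistic` on the normalized inter-arrival series obtained for each candidate value and keep the candidate with the smallest statistic.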



The plots of Figure 5 show the evolution of the Kolmogorov-Smirnov distance corresponding to the
traces displayed in Figure 4. In the 3 cases, $T_\mu$ clearly attains its minimum for $\mu$ close
to the actual value. The corresponding estimated processes $\hat{R}(t)$ derived from Eq. (11)
match fairly well the real evolution of the (R) class in our model (see Figure 6).

FIGURE 5 - Evolution of the exponential test statistic (10) applied to the traces of Figure 4.
Dotted vertical lines locate the actual value of $\mu$ for each case; dot markers on each curve
indicate the estimated value $\hat{\mu}$ corresponding to the minimum point of the statistical
test $T_\mu$.



Propagation parameters $\beta$ and $l$. According to our model, the arrival rate $\lambda(t)$ of new viewers is
given by Eq. (??). It depends linearly on the current number of active and past viewers. So, from
the observation $I(t)$ and the reconstructed process $\hat{R}(t)$ of Eq. (11), we could formally apply the
maximum likelihood estimator of Eq. (7) to estimate $\beta$. In practice however, we have to bear in mind that:
(i) the arrival process of rate $\lambda(t)$ comprises a spontaneous viewer ingress that is governed by
parameter $l$ and is independent of the current state of the system; (ii) depending on the
current hidden state of the model (buzz-free versus buzz state), it is alternately $\beta = \beta_1$ and
$\beta = \beta_2$ that fixes the propagation rate in Eq. (??). We designed an estimation procedure based on
a weighted linear regression that simultaneously addresses these two issues. We decompose our






FIGURE 6 - Evolution of the number of active past viewers versus time (hrs). Comparison of the actual
(non-observable) process $R(t)$ (blue curve) with the estimated process $\hat{R}(t)$ (red curve) derived from
expression (??).



rationale in two steps. First, let us consider the buzz-free state only and $\beta = \beta_1$. As discussed in
the estimation of $\mu$, the inter-arrival time $w$ between the consecutive arrivals of two new viewers
is an exponentially distributed random variable such that $E(w \mid I(t) + R(t) = x) = (\beta x + l)^{-1}$.
Concretely then, for different values of the sum $I(t) + \hat{R}(t)$, we calculate the conditional empirical
mean:

$$\Omega(x) = \frac{1}{|\mathcal{I}(x)|} \sum_{n \in \mathcal{I}(x)} w_n, \qquad \mathcal{I}(x) = \{ n : I(t_n) + \hat{R}(t_n) = x \}. \qquad (12)$$

The linear regression of $(\Omega(x))^{-1}$ against $x$ yields, in one go, estimates of both parameters $\beta$ (slope)
and $l$ (intercept).

Let us now return to the general form of our model with alternation of buzz and buzz-free
periods. In the buzz-free case, $\beta = \beta_1$ corresponds to a normal workload activity, meaning that
the sum $I(t) + R(t)$ takes on rather moderate values. Conversely, when the system undergoes
a buzz, $\beta = \beta_2$ and the population $I(t) + R(t)$ suddenly increases to reach significantly larger
values. Yet, in both cases, the quantity $\Omega^{-1}$ defined from Eq. (12) remains linear in $x$, but with two
different regimes (slopes) depending on the amplitude of $I(t) + R(t) = x$. As a result, it is possible
to reduce the bias that $\beta_2$ causes on the estimation of $\beta_1$, using a weighted linear regression of
$\Omega^{-1}$ versus $x$ where the weights $p(x)$ are proportional to the cardinality of the index sets $\mathcal{I}(x)$.
Indeed, $|\mathcal{I}(x)|$ should be smaller for larger values of $x$ because buzz episodes are expected to be
less frequent than nominal activity periods. Figure 7 confirms the claim: the plots $(x, \Omega^{-1})$ show
a manifest linear trend with higher variability at large $x$, meaning that fewer terms entered the
sum of Eq. (12).
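The weighted regression described above can be sketched as follows. This is only an illustrative sketch: the function name `estimate_beta_l` is hypothetical, the weights multiply the squared residuals, and the synthetic data (a single regime with $\beta = 0.1$ and $l = 1.0$, with fewer samples at large $x$) are assumptions made for the check, not values taken from the traces.

```python
import numpy as np

def estimate_beta_l(w, x):
    """Weighted regression of 1/Omega(x) on x; returns (slope beta, intercept l)."""
    w = np.asarray(w, dtype=float)
    levels, inverse, counts = np.unique(np.asarray(x), return_inverse=True, return_counts=True)
    omega = np.bincount(inverse, weights=w) / counts  # conditional empirical means, Eq. (12)
    y = 1.0 / omega
    p = counts / counts.sum()  # weights proportional to the number of samples at each x
    x_bar = np.sum(p * levels)
    y_bar = np.sum(p * y)
    beta = np.sum(p * (levels - x_bar) * (y - y_bar)) / np.sum(p * (levels - x_bar) ** 2)
    return float(beta), float(y_bar - beta * x_bar)

# Synthetic check: exponential inter-arrival times with rate 0.1 * x + 1.0
rng = np.random.default_rng(1)
x = np.concatenate([np.full(20000 // (1 + k), k) for k in range(31)])
w = rng.exponential(scale=1.0 / (0.1 * x + 1.0))
beta_hat, l_hat = estimate_beta_l(w, x)
```

Because the weights follow the sample counts, the sparsely populated large values of $x$ contribute little to the fit, which is the mechanism the text relies on to limit the influence of buzz periods on $\beta_1$.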

Formally, we can apply the exact same procedure to estimate $\beta_2$, but considering opposite
weights to favor the large values of $x$. However, due to the large fluctuations of $(\Omega(x))^{-1}$ in
the corresponding region, the estimated slope $\hat{\beta}_2$ has a very large variance. Instead, we
propose to apply the ML estimator described in Eq. (7) on the restriction of $I(t)$ to the buzz
periods only. Strictly speaking, we should consider $R(t)$ as well, but since a buzz event normally
occurs over a very short interval of time, we assume that $R(t)$ (resp. $\hat{R}(t)$) remains constant in the
meanwhile (flash crowd viewers enter the R compartment only after the viewing time).
In practice, to automatically identify the buzz periods, we threshold $I(t)$ and consider only the
persistent increasing parts that remain above the threshold.

FIGURE 7 - Weighted linear regression of $\Omega^{-1}$ versus $x$ corresponding to the three traces of Figure 4.
Superimposed are the linear trends fitted on the respective data.

Transition rates $a_1$ and $a_2$. As we already said, at time $t$, the inter-arrival time $w$ separating two
new incomers is a random variable drawn from an exponential law of parameter $\lambda = \beta(i + r) + l$,
where $I(t) + R(t) = i + r$ and $\beta$ is either equal to $\beta_1$ or to $\beta_2$. We denote by $f_1(w)$ and $f_2(w)$ the
corresponding densities built upon the reconstructed process $\hat{R}(t)$ and the estimated parameters
$(\hat{\beta}_1, \hat{l})$ and $(\hat{\beta}_2, \hat{l})$ respectively. For a given inter-arrival time $w = w_n$ observed at time $t_n$, we
form the likelihood ratio $f_2(w_n)/f_1(w_n)$ to determine whether the system is in the buzz or in the
buzz-free state. Moreover, in order to avoid non-significant state transitions, we resort to a restoration
method inspired by the Viterbi algorithm [19]. Once we have identified the hidden states of the
process, we estimate the transition rates $a_1$ and $a_2$ from the average times spent in each state.
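The two ingredients of this step, the likelihood ratio and the sojourn-time estimate, can be sketched as below. This is a simplified illustration with hypothetical function names: it omits the Viterbi-inspired restoration, treats the first and last runs as complete sojourns, and assumes each sample carries a known duration. Which returned rate corresponds to $a_1$ or $a_2$ depends on the model's convention.

```python
import numpy as np

def log_likelihood_ratio(w, x, beta1, beta2, l):
    """log f2(w)/f1(w) for exponential densities of rates beta_k * x + l."""
    x = np.asarray(x, dtype=float)
    rate1 = beta1 * x + l
    rate2 = beta2 * x + l
    return np.log(rate2 / rate1) - (rate2 - rate1) * np.asarray(w, dtype=float)

def transition_rates(states, durations):
    """Inverse mean sojourn time in state 0 (buzz-free) and in state 1 (buzz)."""
    states = np.asarray(states)
    durations = np.asarray(durations, dtype=float)
    # Index of the first sample of every run of identical states
    run_starts = np.concatenate(([0], np.flatnonzero(np.diff(states) != 0) + 1))
    run_states = states[run_starts]
    run_lengths = np.add.reduceat(durations, run_starts)  # time spent in each run
    leave_0 = 1.0 / run_lengths[run_states == 0].mean()
    leave_1 = 1.0 / run_lengths[run_states == 1].mean()
    return float(leave_0), float(leave_1)

# Toy sequence of detected states, one time unit per sample
leave_free, leave_buzz = transition_rates([0, 0, 0, 1, 1, 0, 0, 0, 1], np.ones(9))
```

A positive log-likelihood ratio favors the buzz state for the corresponding inter-arrival time; a short inter-arrival time at a given population size pushes the ratio upwards.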



4.2 Numerical Validation 

To evaluate the statistical performance of our estimation procedure, we resort to numerical
experiments to empirically obtain the first and second order moments of each parameter estimator.
Owing to the versatility of our model, we must ensure that the proposed calibration algorithm
performs well for a variety of workload dynamics. To this end, we systematically reproduce the
experiments considering the 3 sets of parameters reported in Table 1. For each configuration, we
generate 10 independent realizations of processes similar to the ones depicted in Figure 4 and
use these to derive descriptive statistics.

The box-and-whisker plots of Figure 8 indicate, for each estimated parameter (centered and
normalized by the corresponding actual value), the sample median (red line), the inter-quartile
range (blue box height) along with the extreme samples (whiskers) obtained from time series of
length $2^{21}$ points each. As expected (owing to the maximum likelihood procedure), the estimation
of $\gamma$ proves to be the most accurate, both in terms of bias and variance. More surprisingly,
although the estimation of $\beta_1$ derives from a heuristic procedure that itself depends on the
raw approximation $\hat{R}(t)$ of Eq. (??), the resulting performance is remarkably good: bias is always
negligible (less than 5% in the worst case (c)) and the variance always remains within 10% of the
actual value of $\beta_1$. Notice also that the estimation of $\beta_1$ goes from a slight underestimation in case
(a) to a slight overestimation in case (c), as the buzz effect, i.e. the value of $\beta_2$, grows from traces
(a) to (b). Compared to $\beta_1$, the estimation of $\beta_2$ behaves more poorly and proves to be the most
difficult parameter to estimate. But we have to keep in mind that the latter is only based on buzz
periods, which represent only a small fraction of the entire time series. Regarding the parameter $\mu$,
its estimation remains within a 20% inter-quartile range, but cases (a) and (c) show a systematic
bias (the median hits the lower quartile bound). Let us then recall that the procedure described by
Eq. (11) to determine $\mu$ selects, within some discretized interval, the value of $\mu$ that yields the best
score. It is then very likely that the true value does not coincide with any sampled point of
the interval and therefore the procedure picks the closest one, which systematically lies beneath or
above. Finally, estimation of the transition parameters $a_1$ and $a_2$ between the two hidden states
relies on the estimation of all other parameters, thus accumulating all relative inaccuracies. Nonetheless,
and despite a systematic underestimating trend, precision remains within a very acceptable confidence
interval.

FIGURE 8 - Box-and-whisker plots for relative estimation errors of the model parameters for the three
different sets of prescribed parameters reported in Table 1. For each case (a)-(c), statistics are computed
over 10 independent realizations of time series of length $2^{21}$ points.

Convergence rate of the empirical estimators is another important feature that binds
the estimate precision to the amount of available data. Using the same data set, the bar plots of
Figure 9 depict the evolution of the mean square error $\mathrm{MSE}(\hat{\theta}) = E\{(\hat{\theta} - \theta)^2\}$, where the generic
$\theta$ stands for any parameter of the model, with the length $N$ of the observable time series $I(t)$.
As our purpose is to stress the rate of convergence of these quantities towards zero, to ease
the comparison we normalize the MSE of each parameter by its particular value at maximum
data length (i.e. $2^{21}$ points here). Then, the estimator rate of convergence $\alpha_\theta$ corresponds to the
decaying slope of the MSE with respect to $N$ in a log-log plot, i.e. $\mathrm{MSE}(\hat{\theta}) \sim O(N^{-\alpha_\theta})$. For the
different parameters of our model we obtain convergence rates that lie between $\alpha_{\beta_1} = 0.9$ and
$\alpha_{a_2} = 0.2$, leading each time to sub-optimal convergence ($\alpha_\theta < 1$). It is worth noticing that,
despite its relatively ad hoc construction, the estimator of $\beta_1$ has an almost optimal convergence
rate, which supports the soundness of our approach.
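Reading the convergence rate off a log-log plot amounts to a straight-line fit, as in the short sketch below. The function name and the synthetic MSE values are assumptions made for illustration; they are not the measurements behind Figure 9.

```python
import numpy as np

def convergence_rate(lengths, mse):
    """Decay exponent alpha such that MSE ~ N**(-alpha), from a least-squares fit in log-log scale."""
    slope, _intercept = np.polyfit(np.log(lengths), np.log(mse), 1)
    return -float(slope)

# Synthetic MSE decaying exactly as N**(-0.9) over data lengths 2**12 ... 2**21
lengths = 2.0 ** np.arange(12, 22)
alpha = convergence_rate(lengths, 3.0 * lengths ** -0.9)
```

Normalizing each MSE curve by its value at the largest data length, as done in the text, shifts the curve vertically in log-log scale and therefore leaves the fitted slope unchanged.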
FIGURE 9 - Evolution of the Mean Square Error versus the data length $N$ in a log-log plot. For the sake
of conciseness, we only show here the results corresponding to case (b) of Table 1.

5 Validation of the estimation procedure against a real workload trace

We now apply the calibration procedure detailed in the previous section to fit our model on
real data and to assess the data-model adequacy in the context of highly volatile workloads. As a
paradigm for variable demand applications, we use a VoD trace released by the Greek Research
and Technology Network (GRNET) VoD servers [20]. Since the trace shows modest activity, with
a workload level that is not sufficient to stress-test our system, we scale up the data by shrinking
all inter-arrival times by a factor of 10. The resulting workload time series displayed in Figure 10
clearly shows two distinct periods of steady activity before and after the time index t = 200. We
consider the sub-series on both sides of this cutoff time as two individual workload processes,
referred to as trace I and trace II respectively, and we calibrate our model on each of them
separately.

FIGURE 10 - Real workload time series corresponding to a VoD server demand from [20]. The initial trace
was scaled up by a factor of 10 to increase the mean workload. The trace is split into two separate
processes (Trace I and II) corresponding to different activity levels.

Results of the parameter estimation are reported in Table 2, and we verified that in both cases
the stability condition of Eq. (??) was satisfied. In the same vein, we also compared the empirical
mean workload of each trace with its corresponding theoretical value given by formula (??). We
obtain for trace I a relative difference of 12% ($E(i) = 5.59$ compared to $\langle i \rangle = 4.99$), and of 12.5%
for trace II ($E(i) = 0.621$ compared to $\langle i \rangle = 0.71$). Naturally, the correspondence here is not as
striking as it is with the synthetic traces of Section 3. But we must bear in mind that, first, ab
initio nothing guarantees that the underlying system matches our model dynamics and, second,
traces I and II can possibly encompass short-scale non-stationary periods (e.g. day versus night
activity) which are not accounted for in our model. Notwithstanding this, the match we observe
is quite satisfactory, and we now focus on higher order statistics to further characterize the
data-model adequacy. As we have neither a closed form for the steady-state distribution of our Markov
process model nor an analytic expression for its autocorrelation function, we use the
two sets of estimated parameters of Table 2 to synthesize two time series that we compare to the
real workload traces I and II. We refer to those synthetic traces as the fitted traces I and II.
The plots in Figure 11 show the empirical steady-state densities and the sample autocorrelation
functions of both the real and the fitted traces. The superimposition of the different curves
is clear evidence of our model's ability to capture the statistical distribution of the number
of current viewers over time. But also, and perhaps more importantly, it demonstrates that
the dynamical mechanism underlying our constructive model is able to closely reproduce the
temporal structure of the real traces, by imposing the correct statistical dependencies between
distant samples $I(t)$ and $I(t + \tau)$ of the process.
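The sample autocorrelation used in this comparison can be computed as in the sketch below. It assumes a regularly sampled workload series and uses the common biased estimator normalized by the lag-0 sum; the function name is hypothetical and the report does not specify which estimator was used.

```python
import numpy as np

def sample_autocorrelation(x, max_lag):
    """Sample autocorrelation of a regularly sampled series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    lag0 = np.sum(centered ** 2)
    return np.array([np.sum(centered[: x.size - k] * centered[k:]) / lag0
                     for k in range(max_lag + 1)])

# Toy series alternating between +1 and -1 (100 samples)
acf = sample_autocorrelation(np.tile([1.0, -1.0], 50), 2)
```

Applying the same function to a real trace and to its fitted counterpart gives the two curves whose superimposition is discussed in the text.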

In addition to serving as a means to evaluate the goodness-of-fit of our model, the estimated
parameters bring, on their own, valuable insight into the system itself. For instance, let us
compare the propagation rates $\beta_1$ and $l$ estimated from traces I and II, successively. In the
first case, $\beta_1 < l$, meaning that the arrival of new viewers is dominated by spontaneous incomers
and is not so much due to information propagation through gossip. Conversely, $\beta_1 > l$ for the
second workload regime, indicating that the spontaneous attraction of the server has severely
dropped, whereas the peer-to-peer diffusion component has significantly increased, but not sufficiently
to sustain the mean workload activity. At the same time, the index $\mu$ tripled, meaning that the
mean memory period for propagation shrank by a factor of 3. This parameter could then be used
as an indicator of the interest of the content delivered by the server, and of its lifetime in users' minds.



Table 2 - Estimated Parameters from traces I and II separately. 
FIGURE 11 - Comparison of the empirical steady-state distribution and of the autocorrelation function
of the real (blue curves) and the fitted (red curves) traces. The top two plots correspond to trace I and
the bottom plots correspond to trace II.



6 Large Deviation Principle and its interpretation

Consider a continuous-time Markov process $(X_t)_{t \ge 0}$, taking values in a finite state space $S$, of
rate matrix $A = (A_{ij})_{i \in S, j \in S}$. In our case, $X$ is a vectorial process $X(t) = (N_I(t), N_R(t))$, $\forall t \ge 0$,
and $S = \{0, \dots, I_{\max}\} \times \{0, \dots, R_{\max}\}$. If the rate matrix $A$ is irreducible, then the process
$X$ admits a unique steady-state distribution $\pi$ satisfying $\pi A = 0$. Moreover, by Birkhoff's ergodic
theorem, it is known that for any mapping $\Phi : S \to \mathbb{R}$, the sample mean of $\Phi(X)$ at scale $\tau$,
i.e. $\frac{1}{\tau} \int_0^\tau \Phi(X_s)\,ds$, converges almost surely towards the mean of $\Phi(X)$ under the steady-state
distribution, as $\tau$ tends to infinity. The function $\Phi$ is often called the observable. In our case,
as we are interested in the variations of the current number of users $N_I(t)$, $\Phi$ will simply be
the function that selects the first component: $\Phi(N_I(t), N_R(t)) = N_I(t)$. The large deviations
principle (LDP), which holds for irreducible Markov processes on a finite state space [21], gives
an efficient way to estimate the probability for the sample mean calculated over a large period of
time $\tau$ to be around a value $\alpha \in \mathbb{R}$ that deviates from the almost-sure mean:

$$\lim_{\epsilon \to 0} \lim_{\tau \to \infty} \frac{1}{\tau} \log P\left\{ \frac{1}{\tau} \int_0^\tau \Phi(X_s)\,ds \in [\alpha - \epsilon, \alpha + \epsilon] \right\} = f(\alpha). \qquad (13)$$

The mapping $\alpha \mapsto f(\alpha)$ is called the large deviations spectrum (or the rate function). For a
given function $\Phi$, it is possible to compute the theoretical large deviations spectrum from the
rate matrix $A$ as follows. One first computes, for each value of $q \in \mathbb{R}$, the quantity $\Lambda(q)$ defined
as the principal eigenvalue (i.e., the largest) of the matrix with elements $A_{ij} + q\,\delta_{ij}\,\Phi(j)$ ($\delta_{ij} = 1$
if $i = j$ and 0 otherwise). Then the large deviations spectrum can be computed as the Legendre
transform of $\Lambda$:

$$f(\alpha) = \sup_{q \in \mathbb{R}} \{ q\alpha - \Lambda(q) \}, \quad \forall \alpha \in \mathbb{R}. \qquad (14)$$
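The two-step computation (principal eigenvalue, then Legendre transform) can be sketched numerically as follows. This is a minimal illustration on a hypothetical two-state chain, with the supremum taken over a finite grid of $q$ values; the function names are assumptions. The value returned is the Legendre transform of Eq. (14), which in standard large deviation theory is nonnegative and vanishes at the steady-state mean, so that the probability of a deviation decays like $\exp(-\tau \times \text{value})$.

```python
import numpy as np

def scaled_cumulant(rate_matrix, phi, q):
    """Principal eigenvalue of the tilted matrix A + q * diag(phi)."""
    tilted = np.asarray(rate_matrix, dtype=float) + q * np.diag(phi)
    return float(np.max(np.linalg.eigvals(tilted).real))

def legendre_transform(rate_matrix, phi, alpha, q_grid):
    """Supremum over the grid of q * alpha - Lambda(q)."""
    return float(max(q * alpha - scaled_cumulant(rate_matrix, phi, q) for q in q_grid))

# Hypothetical two-state chain with unit switching rates; the observable is the state index
A = np.array([[-1.0, 1.0], [1.0, -1.0]])
phi = np.array([0.0, 1.0])
q_grid = np.linspace(-5.0, 5.0, 2001)
at_mean = legendre_transform(A, phi, 0.5, q_grid)       # steady-state mean of phi is 0.5
at_deviation = legendre_transform(A, phi, 0.8, q_grid)  # closed form gives 0.2 for this chain
```

For the bivariate process of the report, the same functions would apply to the rate matrix on $\{0, \dots, I_{\max}\} \times \{0, \dots, R_{\max}\}$ after flattening the state space, with `phi` holding the first component of each state.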

As described in Equation (13), $\alpha_\tau = \langle i \rangle_\tau$ corresponds, in our case study, to the mean number
of users $i$ observed over a period of time of length $\tau$, and $f(\alpha)$ relates to the probability of its
occurrence as follows:

$$P\{\langle i \rangle_\tau \approx \alpha\} \sim e^{\tau \cdot f(\alpha)}. \qquad (15)$$

Interestingly also, if the process is strictly stationary (i.e. the initial distribution is invariant),
the same large deviation spectrum $f(\cdot)$ can be estimated from a single trace, provided that it is
"long enough" [22]. We proceed as follows: at a scale $\tau$, the trace is chopped into $k_\tau$ intervals
$\{I_{j,\tau} = [(j-1)\tau, j\tau[,\ j = 1, \dots, k_\tau\}$ of length $\tau$ and we have (almost surely), for all $\alpha \in \mathbb{R}$:

$$f_\tau(\alpha, \epsilon_\tau) = \frac{1}{\tau} \log \frac{\#\left\{ j : \frac{1}{\tau} \int_{I_{j,\tau}} \Phi(X_s)\,ds \in [\alpha - \epsilon_\tau, \alpha + \epsilon_\tau] \right\}}{k_\tau} \qquad (16)$$

and $\lim_{\tau \to \infty} f_\tau(\alpha, \epsilon_\tau) = f(\alpha)$.

In practice, for the empirical estimation of the large deviations spectrum, we use an estimator
similar to the one derived in [22] and also used in [23]. At scale $\tau$, we compute for each $q \in \mathbb{R}$
the values of $\Lambda'_\tau(q)$ and $\Lambda''_\tau(q)$, where
$\Lambda_\tau(q) = \tau^{-1} \log\left( k_\tau^{-1} \sum_{j=1}^{k_\tau} \exp\left( q \int_{I_{j,\tau}} \Phi(X_s)\,ds \right) \right)$. Then,
for each value of $\tau$, we count the number of intervals $I_{j,\tau}$ verifying the condition in expression (16)
and estimate the scale-dependent empirical log-pdf $f_\tau(\alpha, \epsilon_\tau)$, with the adaptive choices derived
in [23]:

$$\alpha_\tau = \Lambda'_\tau(q) \quad \text{and} \quad \epsilon_\tau = \sqrt{\frac{\Lambda''_\tau(q)}{\tau}}. \qquad (17)$$
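The counting estimator of expression (16) can be sketched as follows for a regularly sampled trace. This is a simplified illustration: the function names are hypothetical, the tolerance $\epsilon$ is passed as a fixed value instead of the adaptive choice of (17), and intervals are expressed as a number of samples together with their duration $\tau$.

```python
import numpy as np

def interval_means(trace, samples_per_interval):
    """Average of the observable over consecutive non-overlapping intervals."""
    trace = np.asarray(trace, dtype=float)
    k = trace.size // samples_per_interval
    return trace[: k * samples_per_interval].reshape(k, samples_per_interval).mean(axis=1)

def empirical_spectrum(trace, samples_per_interval, tau, alpha, epsilon):
    """(1/tau) * log of the fraction of intervals whose mean lies within epsilon of alpha."""
    means = interval_means(trace, samples_per_interval)
    hits = np.count_nonzero(np.abs(means - alpha) <= epsilon)
    if hits == 0:
        return float("-inf")  # no interval observed around alpha at this scale
    return float(np.log(hits / means.size) / tau)

# Toy trace: half of the intervals of two samples have mean 2, the other half mean 0
toy_trace = [0, 0, 2, 2, 0, 0, 2, 2]
f_at_2 = empirical_spectrum(toy_trace, samples_per_interval=2, tau=2.0, alpha=2.0, epsilon=0.5)
```

Values of $\alpha$ for which no interval is observed return minus infinity, which mirrors the shrinking support of the empirical spectra at large scales.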

Let us now illustrate the LDP in the context of the specific VoD use case, where $X$ corresponds
to $(i, r)$, the bivariate Markov process, $\Phi(X) = i$ is the observable, and
$\frac{1}{\tau} \int_0^\tau \Phi(X_s)\,ds = \langle i \rangle_\tau$ corresponds to the average number of users within a period $\tau$.



6.1 Numerical Interpretations

For ease of computation, we estimate the large deviation spectrum for cases where $I_{\max} = 30$
and $R_{\max} = 60$. We also choose the parameters accordingly (so that the process does not saturate at
the maximum value) for the buzz and buzz-free scenarios. For the buzz case: $\beta_1 = 0.1$, $\beta_2 = 0.8$,
$\gamma = 0.7$, $\mu = 0.3$, $l = 1.0$, $a_1 = 0.006$ and $a_2 = 0.6$. For the buzz-free case: $\beta_1 = \beta_2 = \beta = 0.1$,
$\gamma = 0.7$, $\mu = 0.3$, $l = 1.0$. Intrinsically, the large deviation principle embeds the notion of time
scale into the statistical description of the aggregated observable at different time resolutions.
As expected, the theoretical LD spectra displayed in Figure 12(a) reach their maximum for the
same mean number of users. This apex is the almost sure value described in Section ??. As
the name suggests, the almost sure workload ($\alpha_{a.s.}$) corresponds to the mean value that we almost
surely observe on the trace. More interestingly though, the LD spectrum corresponding to the
buzz case spans a much larger interval of observable mean workloads than that of the
buzz-free case. This remarkable widening of the support of the theoretical spectrum shows that the LDP
can accurately quantify the occurrence of extreme, yet rare events.
FIGURE 12 - Large deviations spectra corresponding to the traces of Figure 4. (a) Theoretical spectra
for the buzz-free (blue) and for the buzz (red) scenarios. (b) & (c) Empirical estimations of $f(\alpha)$ at
different scales from the buzz-free and the buzz traces, respectively.

Plots (b)-(c) of Figure 12 compare theoretical and empirical large deviation spectra obtained
for the two traces. For each given scale ($\tau$), the empirical estimation procedure yields one LD
estimate. These empirical estimates at different scales superimpose over a given range of $\alpha$. This
is reminiscent of the scale-invariant property underlying the large deviation principle. If we focus
on the supports of the different estimated spectra, the larger the time scale $\tau$ is, the smaller
the interval of observable values of $\alpha$ becomes. This is coherent with the fact that, for a finite
trace length, the probability of observing a number of current viewers that, on average, deviates
from the nominal value ($\alpha_{a.s.}$) during a period of time $\tau$ decreases exponentially fast with $\tau$. To
fix ideas, the estimates of plot (c) indicate that for a time scale $\tau = 400$ sec, the maximum
observable mean number of users is around 5 with probability $e^{400 \cdot (-0.02)} = e^{-8} \approx 3.35 \cdot 10^{-4}$ (point A),
while it increases up to 9 with the same probability ($e^{100 \cdot (-0.08)}$) for $\tau = 100$ sec (point B).
7 Resource management policies

Returning to our VoD use case, we now sketch two possible schemes for exploiting the large
deviation description of the system to dynamically provision the allocated resources:

- Identification of the reactive time scale for reconfiguration: find a relevant time scale that
realizes a good trade-off between the expected level of overflow associated with this scale
and a sustainable OPEX cost induced by the resource reconfiguration needed to cope with
the corresponding flash crowd.

- Link capacity dimensioning: considering a maximum admissible loss probability, find the
safety margin that must be provisioned on the link capacity to guarantee the
corresponding QoS.

7.1 Identification of the reactive time scale for reconfiguration 

We consider the case of a VoD service provider who wants to determine the reactivity scale at
which it needs to reconfigure its resource allocation. This quantity should clearly derive from a
good compromise between the level of congestion (or losses) it is ready to undergo, i.e. a tolerable
performance degradation, and the price it is willing to pay for a frequent reconfiguration of its
infrastructure. Let us then assume that the VoD provider has fixed admissible bounds for these
two competing factors, having determined the following quantities:

- $\alpha^* > \alpha_{a.s.}$: the deviation threshold beyond which it becomes worth (or mandatory)
considering reconfiguring the resource allocation. This choice is uniquely determined by a CAPEX
performance concern.

- $\sigma^*$: an acceptable probability of occurrence of these overflows. This choice is essentially
guided by the corresponding OPEX cost.

Let us moreover suppose that the LD spectrum $f(\alpha)$ of the workload process was previously
estimated, either by identifying the parameters of the Markov model used to describe the
application, or empirically from collected traces. Then, recalling the probabilistic interpretation
we surmised in relation (15), the minimum reconfiguration time scale $\tau^*$ for dynamic resource
allocation that satisfies the sought compromise is simply the solution of the following inequality:

(18)

with $f_\tau(\alpha)$ as defined in expression (16).

From a more general perspective though, we can see this problem as an underdetermined
system involving 3 unknowns ($\alpha^*$, $\tau^*$ and $\sigma^*$) and only one relation (18). Therefore, depending
on the sought objectives, we can imagine fixing any two of these variables and determining
the resulting third so that it abides by the same inequality as in expression (18).
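As a rough illustration of this trade-off, the sketch below uses only the pointwise interpretation of relation (15), $P\{\langle i \rangle_\tau \approx \alpha^*\} \sim e^{\tau f(\alpha^*)}$, and assumes a spectrum value that does not depend on the scale. It is not a reconstruction of inequality (18), and the function name is hypothetical. Under these assumptions, requiring the probability to stay below $\sigma^*$ gives $\tau \ge \ln(\sigma^*) / f(\alpha^*)$ whenever $f(\alpha^*) < 0$.

```python
import math

def min_reconfiguration_scale(f_at_threshold, sigma_star):
    """Smallest tau with exp(tau * f) <= sigma_star, for a negative spectrum value f."""
    if f_at_threshold >= 0.0:
        raise ValueError("the spectrum value at the threshold must be negative")
    if not 0.0 < sigma_star < 1.0:
        raise ValueError("sigma_star must be a probability strictly between 0 and 1")
    return math.log(sigma_star) / f_at_threshold

# With f = -0.02 and a tolerated probability of exp(-8), the scale is about 400 sec,
# in line with point A of the numerical illustration of Section 6.1
tau_star = min_reconfiguration_scale(-0.02, math.exp(-8.0))
```

Fixing $\tau^*$ and $\alpha^*$ instead and solving for $\sigma^*$, or fixing $\tau^*$ and $\sigma^*$ and searching for $\alpha^*$, follows the same relation read in a different direction.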

7.2 Link capacity dimensioning 

We now consider an architecture dimensioning problem from the infrastructure provider's
perspective. Let us assume that the infrastructure and the service providers have come to a Service
Level Agreement (SLA) which, among other things, fixes a tolerable level of losses due to link
congestion. We start by considering the case of a single VoD server and address the following
question: what is the minimum link capacity $C$ that has to be provisioned so that we meet the
negotiated QoS in terms of loss probability? As in the previous case, we assume that the estimated
LD spectrum $f(\alpha)$ characterizing the application has been identified beforehand. A rudimentary
SLA would be to guarantee a loss-free transmission for the normal traffic load only: this loose
QoS would simply amount to fixing $C$ to the almost sure workload $\alpha_{a.s.}$. Naturally then, any load
overflow beyond this value will result in goodput limitation (or losses, if there is no buffer to
smooth out exceeding loads). For a more demanding QoS, we are led to determine the necessary
safety margin $C_0 > 0$ one has to provision above $\alpha_{a.s.}$ to absorb the exact amount of overruns
corresponding to the loss probability $p_{loss}$ that was negotiated in the SLA. From the interpretation
of the large deviation spectrum provided in Section ??, this margin $C_0$ is determined by the
resolution of the following inequality:
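$$C_0 : \int_{\alpha_{a.s.}+C_0}^{\infty} \int_{\tau_{\min}}^{\tau_{\max}} e^{\tau \cdot f(\alpha)}\, d\tau\, d\alpha = \int_{\alpha_{a.s.}+C_0}^{\infty} \frac{e^{\tau_{\max} \cdot f(\alpha)} - e^{\tau_{\min} \cdot f(\alpha)}}{f(\alpha)}\, d\alpha \le p_{loss}. \qquad (19)$$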



In this expression, $\tau_{\min}$ is typically determined by the size $Q$ of the buffers that are usually
provisioned to dampen the traffic volatility. In that case,

$$\tau_{\min} = \frac{Q}{\alpha - (\alpha_{a.s.} + C_0)} \qquad (20)$$

corresponds to the maximum burst duration that can be buffered without causing any loss at
rate $\alpha > C = \alpha_{a.s.} + C_0$. As for $\tau_{\max}$, it relates to the maximum period of reservation dedicated
to the application. Most often though, the characteristic time scale of the application exceeds
the dynamic scale of flash crowds by several orders of magnitude, and $\tau_{\max}$ can then simply be
set to infinity. With these particular integration bounds, Equation (19) simplifies to

$$\int_{\alpha_{a.s.}+C_0}^{\infty} \frac{-\,e^{\tau_{\min} \cdot f(\alpha)}}{f(\alpha)}\, d\alpha \le p_{loss}, \qquad (21)$$

whose left-hand side is a decreasing function of $C$, so that the inequality can be solved using a
simple bisection technique.
As long as the server workload remains below $C$, this resource dimensioning guarantees that no
loss occurs. Any overrun above this value will produce losses, but we ensure that the frequency
(probability) and duration of these overruns are such that the loss rate conforms to
the SLA. The proposed approach clearly contrasts with resource over-provisioning, which does not
seek to optimize the CAPEX while complying with the loss probability tolerated in the SLA.
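The bisection search for the safety margin can be sketched as follows. Everything specific here is an assumption made for illustration: the quadratic toy spectrum, the constant $\tau_{\min}$ (whereas Eq. (20) makes it depend on $\alpha$), the truncation of the integral at a finite `alpha_max`, and the function names. The left-hand side integrates $-e^{\tau_{\min} f(\alpha)} / f(\alpha)$ for a spectrum that is negative away from $\alpha_{a.s.}$.

```python
import numpy as np

def overflow_bound(c0, spectrum, alpha_as, tau_min, alpha_max, n_points=20001):
    """Trapezoidal integral of -exp(tau_min * f) / f from alpha_as + c0 up to alpha_max."""
    alpha = np.linspace(alpha_as + c0, alpha_max, n_points)
    f = spectrum(alpha)
    integrand = -np.exp(tau_min * f) / f
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(alpha)))

def safety_margin(p_loss, spectrum, alpha_as, tau_min, alpha_max, tol=1e-6):
    """Smallest margin (up to tol) whose overflow bound does not exceed p_loss."""
    lo, hi = tol, alpha_max - alpha_as  # the bound is zero at hi (empty integration range)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if overflow_bound(mid, spectrum, alpha_as, tau_min, alpha_max) > p_loss:
            lo = mid  # margin too small, losses too likely
        else:
            hi = mid  # margin sufficient, try a smaller one
    return hi

# Toy quadratic spectrum centred on a nominal workload of 5 (illustrative only)
def toy_spectrum(alpha):
    return -0.5 * (alpha - 5.0) ** 2

c0 = safety_margin(1e-3, toy_spectrum, alpha_as=5.0, tau_min=10.0, alpha_max=25.0)
```

The search relies only on the bound decreasing with the margin, so an empirically estimated spectrum could replace `toy_spectrum` as long as it is evaluated on the same grid of $\alpha$ values.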

The same provisioning scheme can straightforwardly be generalized to the case of several
applications sharing a common set of resources. To fix ideas, let us consider an infrastructure
provider that wants to host $K$ VoD servers over the same shared link. A corollary question is
then to determine how many servers $K$ the fixed link capacity $C$ can support while guaranteeing
a prescribed level of losses. If the servers are independent, the probability for two of them to
undergo a flash crowd simultaneously is negligible. For ease and without loss of generality, we
moreover suppose that they are identically distributed and modeled by the same LD spectrum
$f^{(k)}(\alpha) = f(\alpha)$ with the same nominal workload $\alpha^{(k)}_{a.s.} = \alpha_{a.s.}$, $k = 1, \dots, K$. Then, following the
same reasoning as in the previous case of a single server, the maximum number $K$ of servers
reads:



$$K = \arg\max_{K} \{ (C - K \cdot \alpha_{a.s.}) \ge C_0 \}, \qquad (22)$$

where the safety margin $C_0$ is defined as in expression (21).

Then, depending on the agreed Service Level Agreements, the infrastructure provider can
easily offer different levels of loss probability (QoS) to its VoD clients, and adapt the number
of hosted servers accordingly.
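Under the reading that a single margin $C_0$ is shared by all servers (since simultaneous flash crowds are deemed negligible), the number of hosted servers is the largest $K$ whose aggregate nominal load still leaves $C_0$ free on the link. The short sketch below encodes that reading; the function name and the numerical values are hypothetical.

```python
import math

def max_servers(capacity, alpha_as, c0):
    """Largest K such that capacity - K * alpha_as is still at least the safety margin c0."""
    if capacity < c0:
        return 0  # the link cannot even hold the safety margin
    return math.floor((capacity - c0) / alpha_as)

# Illustrative values: link capacity 100, nominal workload 5 per server, margin 12
k_max = max_servers(100.0, 5.0, 12.0)
```

A stricter SLA translates into a larger $C_0$ and therefore into fewer servers on the same link.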
FIGURE 13 - Dimensioning $K$, the number of hosted servers sharing a fixed capacity link $C$. The safety
margin $C_0$ is determined according to the probabilistic loss rate negotiated in the Service Level Agreement
between the infrastructure provider and the VoD service provider.



8 Conclusion 

Many applications deployed on a cloud infrastructure, such as a Video on Demand service, are
well known for undergoing highly volatile demand, making their workload hard to characterize
qualitatively and quantitatively. Adopting a constructive approach to capture the VoD users'
behavior, in this report we proposed a simple, concise and versatile model for generating the
workload variations in such a context. We also devised a heuristic identification procedure that
aims at estimating the parameter values of the model from a single collected trace. First, we
numerically evaluated the accuracy of this procedure using several synthetic traces. Our
experiments show that the procedure introduces little bias and typically recovers the actual parameter
values with a relative error of about 10%. Second, we applied this same procedure to two real
video server workload traces. The results obtained demonstrate that, once the model has been
calibrated, it succeeds in reproducing the statistical behavior of the real trace (in terms of both the
steady-state probabilities and the autocorrelations of the workload time series). Moreover, owing
to the constructive nature of our model, the estimated values of the parameters provide valuable
insight into the application that would be difficult, or even impossible, to infer from the raw
traces. The captured information may answer questions of practical interest to cloud-oriented
service providers, such as: is the application workload mostly driven by spontaneous behaviors, or
is it rather subject to a gossip phenomenon?

Furthermore, a key point of this model is that it reproduces the workload time series
with a Markovian process, which is known to satisfy a Large Deviation Principle (LDP). This
particularly interesting property yields a large deviation spectrum whose interpretation enriches
the information conveyed by the standard steady-state distribution: for a given observation
(workload trace), the LDP allows us to infer (theoretically and empirically) the probability that the
time-averaged workload, calculated at an arbitrary aggregation scale, deviates from its nominal
value (i.e. the almost sure value).

We leveraged this multiresolution probabilistic description to conceptualize two different
management schemes for dynamic resource provisioning. As explained, the rationale is to use large
deviation information to help network and service providers agree on the best CAPEX-OPEX
trade-off. Two major stakes of this negotiation are: (i) to determine the largest
reconfiguration time scale adapted to the workload elasticity and (ii) to dimension the VoD server so as
to guarantee with utmost probability the Quality of Service imposed by the negotiated Service
Level Agreement.

More generally though, the same LDP-based concepts can benefit any other "Service on Demand"
scenario to be deployed in dynamic cloud environments.